query performance
ROSfs: A User-Level File System for ROS
Xu, Zijun, Wen, Xuanjun, Song, Yanjie, Yin, Shu
We present ROSfs, a novel user-level file system for the Robot Operating System (ROS). ROSfs interprets a robot file as a group of sub-files, with each having a distinct label. ROSfs applies a time index structure to enhance the flexible data query while the data file is under modification. It provides multi-robot systems (MRS) with prompt cross-robot data acquisition and collaboration. We implemented a ROSfs prototype and integrated it into a mainstream ROS platform. We then applied and evaluated ROSfs on real-world UAVs and data servers. Evaluation results show that compared with traditional ROS storage methods, ROSfs improves the offline query performance by up to 129x and reduces inter-robot online data query latency under a wireless network by up to 7x.
- North America > United States > District of Columbia > Washington (0.05)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > New York > New York County > New York City (0.04)
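To make the idea of labeled sub-files with a time index concrete, here is a minimal sketch. It is not the ROSfs implementation: the class `LabeledTimeIndex`, its methods, and the assumption that records for each label arrive in non-decreasing timestamp order are all hypothetical, chosen only to show how a per-label time index can serve range queries while data is still being appended.

```python
import bisect
from collections import defaultdict


class LabeledTimeIndex:
    """Hypothetical per-label time index (illustrative only, not ROSfs itself)."""

    def __init__(self):
        # One sorted timestamp list and one parallel record list per label.
        self._times = defaultdict(list)
        self._records = defaultdict(list)

    def append(self, label, timestamp, record):
        # Assumption: each label's records arrive in time order, so an
        # append keeps the timestamp list sorted without re-sorting.
        times = self._times[label]
        if times and timestamp < times[-1]:
            raise ValueError("timestamps must be non-decreasing per label")
        times.append(timestamp)
        self._records[label].append(record)

    def query(self, label, start, end):
        # Binary search for the inclusive time window [start, end].
        times = self._times.get(label)
        if not times:
            return []
        lo = bisect.bisect_left(times, start)
        hi = bisect.bisect_right(times, end)
        return self._records[label][lo:hi]


index = LabeledTimeIndex()
index.append("/imu", 1.0, "a")
index.append("/imu", 2.0, "b")
index.append("/camera", 1.5, "c")
index.append("/imu", 3.0, "d")
imu_window = index.query("/imu", 1.5, 3.0)
```

Because each label keeps its own sorted timestamps, a query touches only the requested label and costs a binary search plus the size of the result, rather than a scan of the whole file.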
Multi-agent Databases via Independent Learning
Zhang, Chi, Papaemmanouil, Olga, Hanna, Josiah P., Akella, Aditya
Machine learning is rapidly being used in database research to improve the effectiveness of numerous tasks including but not limited to query optimization, workload scheduling, physical design, etc. Currently, the research focus has been on replacing a single database component responsible for one task by its learning-based counterpart. However, query performance is not simply determined by the performance of a single component, but by the cooperation of multiple ones. As such, learning-based database components need to collaborate during both training and execution in order to develop policies that meet end performance goals. Thus, the paper attempts to address the question "Is it possible to design a database consisting of various learned components that cooperatively work to improve end-to-end query latency?". To answer this question, we introduce MADB (Multi-Agent DB), a proof-of-concept system that incorporates a learned query scheduler and a learned query optimizer. MADB leverages a cooperative multi-agent reinforcement learning approach that allows the two components to exchange the context of their decisions with each other and collaboratively work towards reducing the query latency. Preliminary results demonstrate that MADB can outperform the non-cooperative integration of learned components.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.52)
A Learned Index for Exact Similarity Search in Metric Spaces
Tian, Yao, Yan, Tingyun, Zhao, Xi, Huang, Kai, Zhou, Xiaofang
Indexing is an effective way to support efficient query processing in large databases. Recently the concept of learned index, which replaces or complements traditional index structures with machine learning models, has been actively explored to reduce storage and search costs. However, accurate and efficient similarity query processing in high-dimensional metric spaces remains an open challenge. In this paper, we propose a novel indexing approach called LIMS that uses data clustering, pivot-based data transformation techniques and learned indexes to support efficient similarity query processing in metric spaces. In LIMS, the underlying data is partitioned into clusters such that each cluster follows a relatively uniform data distribution. Data redistribution is achieved by utilizing a small number of pivots for each cluster. Similar data are mapped into compact regions and the mapped values are totally ordinal. Machine learning models are developed to approximate the position of each data record on disk. Efficient algorithms are designed for processing range queries and nearest neighbor queries based on LIMS, and for index maintenance with dynamic updates. Extensive experiments on real-world and synthetic datasets demonstrate the superiority of LIMS compared with traditional indexes and state-of-the-art learned indexes.
- North America > United States (0.14)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Oceania > Australia > Queensland (0.04)
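The pivot-based transformation mentioned in the LIMS abstract can be illustrated with a much simpler scheme. The sketch below is not LIMS: it uses a single pivot, no clustering, and a plain binary search where LIMS would use a learned model. The names `PivotIndex` and `range_query` are hypothetical. It shows only how mapping points to their distance from a pivot yields a totally ordered key, and how the triangle inequality keeps range queries exact.

```python
import bisect
import math


class PivotIndex:
    """Single-pivot metric index (illustrative only; not the LIMS structure)."""

    def __init__(self, points, pivot, dist=math.dist):
        self.pivot = pivot
        self.dist = dist
        # Map every point to a one-dimensional key: its distance to the pivot.
        keyed = sorted(((dist(p, pivot), p) for p in points), key=lambda kp: kp[0])
        self.keys = [k for k, _ in keyed]
        self.points = [p for _, p in keyed]

    def range_query(self, q, r):
        # Triangle inequality: d(q, x) <= r implies |d(x, pivot) - d(q, pivot)| <= r,
        # so only keys inside [dq - r, dq + r] can hold answers.
        dq = self.dist(q, self.pivot)
        lo = bisect.bisect_left(self.keys, dq - r)
        hi = bisect.bisect_right(self.keys, dq + r)
        # Verify the surviving candidates with the true distance (exact search).
        return [p for p in self.points[lo:hi] if self.dist(q, p) <= r]


points = [(0, 0), (1, 0), (0, 2), (5, 5), (3, 4)]
index = PivotIndex(points, pivot=(0, 0))
hits = index.range_query((1, 1), 1.5)
```

The filter step never discards a true answer, so the result matches a brute-force scan; the pivot only reduces how many true distance computations are needed.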
Evolution of ML Fact Store
At Netflix, we aim to provide recommendations that match our members' interests. To achieve this, we rely on Machine Learning (ML) algorithms. ML algorithms can only be as good as the data that we provide to them. This post will focus on the large volume of high-quality data stored in Axion -- our fact store that is leveraged to compute ML features offline. We built Axion primarily to remove any training-serving skew and make offline experimentation faster. We will share how its design has evolved over the years and the lessons learned while building it.
Top Snowflake Interview Questions
Snowflake is a cloud data warehouse provided as software-as-a-service (SaaS). It has a unique architecture designed to handle multiple aspects of data and analytics. Snowflake sets itself apart from traditional data warehouse solutions with advanced capabilities like improved performance, simplicity, high concurrency and cost-effectiveness. Snowflake's shared data architecture physically separates computation and storage, which is not possible with traditional offerings. It streamlines the process for businesses to store and analyze massive volumes of data using cloud-based tools.
- Information Technology > Security & Privacy (1.00)
- Information Technology > Services (0.69)
Oracle accelerates MySQL HeatWave queries with machine learning
Taking aim at competitors including Amazon Aurora and Snowflake, Oracle has enhanced the MySQL HeatWave in-memory query accelerator in the Oracle Cloud's MySQL Database Service by leveraging advanced machine learning. But Oracle insists the improvements do not mean the MySQL Database Service is encroaching on its flagship Oracle Database. The company on August 10 rolled out MySQL Autopilot, a component of HeatWave that uses advanced machine learning techniques to accelerate query performance and scalability. MySQL HeatWave works with the MySQL Database Service in Oracle Cloud Infrastructure (OCI) to accelerate performance for analytics and mixed OLTP (online transaction processing) and OLAP (online analytical processing) workloads. Included with HeatWave at no extra charge, Autopilot automates aspects of achieving high query performance at scale including provisioning, data loading, query execution, and failure handling.
Amazon Timestream - Time series is the new black
From the earliest days of my career, data, and the insights that we draw from that data, have always held a special place in my heart. At a company like Amazon, getting millions of items delivered to customers on demanding timeframes, and running massive world-wide data centers to host our cloud-based service offerings are all dependent on our ability to understand, process, and analyze vast quantities of data. This is of course true in almost every industry – the ability to leverage data can be the difference between your business thriving or dying. As a technology leader, what concerns me about this is that many companies aren't investing in the right kind of technologies that will enable them to be successful here. Take databases, for example: many companies are still using traditional relational databases for everything, simply because they don't know any other way.
- Information Technology > Artificial Intelligence (0.95)
- Information Technology > Databases (0.91)
The RLR-Tree: A Reinforcement Learning Based R-Tree for Spatial Data
Gu, Tu, Feng, Kaiyu, Cong, Gao, Long, Cheng, Wang, Zheng, Wang, Sheng
Learned indices have been proposed to replace classic index structures like B-Tree with machine learning (ML) models. They require to replace both the indices and query processing algorithms currently deployed by the databases, and such a radical departure is likely to encounter challenges and obstacles. In contrast, we propose a fundamentally different way of using ML techniques to improve on the query performance of the classic R-Tree without the need of changing its structure or query processing algorithms.
Despite the success of these learned indices in improving the performance of some types of queries, they still have various limitations, e.g., they can only handle spatial point objects and limited types of spatial queries, some only return approximate query results, and they either cannot handle updates or need a periodic rebuild to retain high query efficiency (Detailed discussions are in Section 2). These limitations, together with the requirement that the learned indices need a replacement of the index structures and ...